feat(plugin-hitl): add human-in-the-loop tool approval flow by exoticknight · Pull Request #23 · moeru-ai/apeira

exoticknight · 2026-05-27T07:22:37Z

IMPOTENT: rely on moeru-ai/xsai#309

feat

Add core resolveToolCallHooks support that bridges plugin preToolCall / postToolCall hooks into xsAI response execution.
Add tool-interruption Apeira events and exported HITL hook/interruption types for host UIs.
Add per-plugin private session state that is snapshotted, restored, forked, and passed into plugin hooks.
Introduce @apeira/plugin-hitl with allow/ask/deny/off modes, policy callbacks, pending request tracking, call/run/conversation approval scopes, model-visible rejection results, and optional tool metadata.
Export @apeira/plugin-hitl through the umbrella apeira package.
Teach @apeira/plugin-ag-ui to map HITL interruptions into HITL_REVIEW_REQUIRED tool results for review-card rendering.
Add deterministic HITL demo flows for CopilotKit and pi-tui, including replay fetch helpers, mock demo tools, approval/rejection paths, CopilotKit review cards, and pi-tui /approve, /reject, /demo, and /hitl commands.
Add a Chat SDK Telegram example that streams Apeira agent text deltas into chat threads with in-memory Chat SDK state and persisted Apeira sessions.

refactor

Rename the core integration from tool-call control decisions to tool-call transformer hooks, matching the xsAI preToolCall / postToolCall API.
Share HITL demo runtime helpers between examples under examples/shared/hitl-demo.ts.
Extract pi-tui transcript append and HITL demo transcript helpers into hitl-demo-transcript.ts.

chore

Add Vitest coverage for the HITL plugin, core plugin private state, AG-UI interruption mapping, CopilotKit HITL runtime/demo behavior, and pi-tui demo helpers.
Add package/workspace wiring for @apeira/plugin-hitl, Chat SDK dependencies, example scripts, lockfile updates, and CopilotKit HITL env prefix support.
Bump xsAI catalog packages to the HITL beta build required by tool-call hooks.
Add the Chat SDK skill metadata and ignore .apeira example session storage.
Add common-tools planning artifacts under docs/plans for approval policy and roadmap follow-up work.

review aids

Core Hook Flow

sequenceDiagram
  participant Model as xsAI responses
  participant Core as Apeira core
  participant Plugin as Plugin hooks
  participant HITL as @apeira/plugin-hitl
  participant Host as UI / TUI host
  participant Tool as Tool executor

  Model->>Core: function_call
  Core->>Plugin: preToolCall(toolCall, executeOptions)
  Plugin->>HITL: inspect tool key + policy

  alt allowed by mode/policy/scope
    HITL-->>Plugin: void
    Plugin-->>Core: void or transformed toolCall
    Core->>Tool: execute(args)
    Tool-->>Core: tool result
    Core->>Plugin: postToolCall(toolResult)
    Core-->>Model: function_call_output
  else denied
    HITL-->>Plugin: synthetic CompletionToolResult
    Plugin-->>Core: TOOL_HITL_REJECTED result
    Core-->>Model: function_call_output
  else human approval required
    HITL-->>Host: tool-interruption event
    HITL-->>Plugin: pending promise
    Host->>HITL: approve(id, scope) or reject(id)
    HITL-->>Plugin: continue or synthetic rejection result
    Plugin-->>Core: resolved hook result
  end

Approval Scope Model

flowchart TD
  ToolCall["Tool call key = toolName + args"] --> Mode{HITL mode}
  Mode -->|allow/off| Continue[Continue tool execution]
  Mode -->|deny| Reject[Return synthetic rejection result]
  Mode -->|ask| Existing{Existing approval?}

  Existing -->|conversation private state| Continue
  Existing -->|run allow set| Continue
  Existing -->|none| Interrupt[Emit tool-interruption]

  Interrupt --> Human{Human decision}
  Human -->|approve call| Continue
  Human -->|approve run| RunAllow[Remember key until turn done]
  Human -->|approve conversation| ConversationAllow[Persist key in plugin private state]
  Human -->|reject| Reject

  RunAllow --> Continue
  ConversationAllow --> Continue
  RunAllow --> TurnDone[onTurnDone clears run approvals]

Package Wiring

flowchart LR
  Core["@apeira/core"]
  HITL["@apeira/plugin-hitl"]
  AGUI["@apeira/plugin-ag-ui"]
  Apeira["@apeira/apeira"]
  Shared["examples/shared/hitl-demo.ts"]
  Copilot["examples/copilotkit"]
  PiTui["examples/pi-tui"]
  ChatSDK["examples/chat-sdk"]

  Core -->|ToolCallHooks + privateState| HITL
  Core -->|tool-interruption event| AGUI
  HITL -->|umbrella export| Apeira
  HITL --> Copilot
  HITL --> PiTui
  AGUI -->|HITL_REVIEW_REQUIRED result| Copilot
  Shared --> Copilot
  Shared --> PiTui
  Core --> ChatSDK

Example Runtime Flows

flowchart TD
  DemoStart["User starts HITL demo"] --> Replay["createHitlReplayFetch returns deterministic response"]
  Replay --> FunctionCall["Assistant emits function_call"]
  FunctionCall --> Hook["Apeira runs preToolCall hook"]
  Hook --> Review["plugin-hitl emits tool-interruption"]
  Review --> HostUI{"Host surface"}
  HostUI -->|CopilotKit| ReviewCard["Render review card"]
  HostUI -->|pi-tui| SlashCommands["Show /approve and /reject commands"]
  ReviewCard --> Decision["approve/reject controller call"]
  SlashCommands --> Decision
  Decision --> Resume["Pending preToolCall promise resolves"]
  Resume --> ExecuteOrReject{"Approved?"}
  ExecuteOrReject -->|yes| ToolExec["Execute mock tool"]
  ExecuteOrReject -->|no| Rejection["Return TOOL_HITL_REJECTED result"]

Suggested Review Order

Step	Focus	Files
1	Core public contract and hook bridging	`packages/core/src/types/plugin.ts`, `packages/core/src/types/event.ts`, `packages/core/src/utils/turn-runner.ts`
2	Plugin private session state semantics	`packages/core/src/utils/agent-runtime.ts`, `packages/core/src/utils/agent.ts`, `packages/core/test/utils/agent.test.ts`
3	HITL approval/rejection behavior	`packages/plugin-hitl/src/index.ts`, `packages/plugin-hitl/test/index.test.ts`, `packages/plugin-hitl/README.md`
4	AG-UI interruption mapping	`packages/plugin-ag-ui/src/index.ts`, `packages/plugin-ag-ui/test/index.test.ts`
5	Demo surfaces	`examples/shared/hitl-demo.ts`, `examples/copilotkit/src/components/chat-panel.tsx`, `examples/pi-tui/src/app.ts`
6	Chat SDK example and workspace wiring	`examples/chat-sdk/*`, `.agents/skills/chat-sdk/SKILL.md`, `package.json`, `pnpm-workspace.yaml`, `pnpm-lock.yaml`

gemini-code-assist

Code Review

This pull request introduces human-in-the-loop (HITL) tool control capabilities for Apeira agents, adding a new @apeira/plugin-hitl package, updating core runtime types to support tool interruption and private plugin state, and integrating these features into the CopilotKit and Pi-TUI examples.

Three important issues were identified in the review: a critical persistence bug in @apeira/plugin-hitl where out-of-turn approvals update a stale private state closure and fail to persist; a high-severity parsing bug in CopilotKit's parseHitlReview where multi-line JSON arguments containing newlines break the simple line-splitting logic; and a medium-severity opportunity to canonicalize JSON arguments in getToolKey to prevent identical tool calls with different key orderings from bypassing conversation-scope approvals.

gemini-code-assist · 2026-05-27T07:24:13Z

+export const hitl = (options: HITLPluginOptions = {}): HITLController => {
+  let mode: HITLMode = options.mode ?? 'ask'
+  const callAllows = new Set<string>()
+  const pending = new Map<string, HITLPendingRequest>()
+  const pendingState = new Map<string, PluginPrivateStateApi<HITLPrivateState>>()
+  const resumeDecisions = new Map<string, HITLResumeDecision>()
+  const rejectedKeys = new Map<string, string>()
+  const runAllows = new Set<string>()
+
+  const resolveDecision = async (context: HITLPolicyContext): Promise<HITLPolicyDecision> => {
+    if (mode === 'off' || mode === 'allow')
+      return { type: 'allow' }
+    if (mode === 'deny')
+      return { type: 'deny' }
+
+    return await options.policy?.(context) ?? { type: 'ask' }
+  }
+
+  const createControl = (privateState: PluginPrivateStateApi<HITLPrivateState>): ToolCallControl => ({
+    preToolCall: async (context) => {
+      const key = getToolKey(context.toolCall)
+
+      if (mode === 'deny') {
+        return {
+          message: 'TOOL_HITL_REJECTED',
+          type: 'reject',
+        }
+      }
+
+      const rejectedMessage = rejectedKeys.get(key)
+      if (rejectedMessage != null) {
+        rejectedKeys.delete(key)
+        return {
+          message: rejectedMessage,
+          type: 'reject',
+        }
+      }
+
+      if (callAllows.has(key)) {
+        callAllows.delete(key)
+        return { type: 'continue' }
+      }
+
+      if (runAllows.has(key) || hasConversationAllow(privateState, key))
+        return { type: 'continue' }
+
+      const decision = await resolveDecision({
+        ...context,
+        key,
+        metadata: metadata.get(context.tool),
+      })
+
+      if (decision.type === 'allow')
+        return { type: 'continue' }
+
+      if (decision.type === 'deny') {
+        return {
+          message: decision.message ?? 'TOOL_HITL_REJECTED',
+          type: 'reject',
+        }
+      }
+
+      const interruption: ToolInterruption = {
+        data: {
+          key,
+          metadata: metadata.get(context.tool),
+        },
+        id: `hitl_${context.toolCall.id}`,
+        reason: decision.message ?? 'Human review required.',
+        toolCall: context.toolCall,
+      }
+
+      pending.set(interruption.id, {
+        id: interruption.id,
+        key,
+        metadata: metadata.get(context.tool),
+        reason: interruption.reason,
+        toolCall: context.toolCall,
+        toolName: context.tool.function.name,
+      })
+      pendingState.set(interruption.id, privateState)
+
+      return {
+        interruption,
+        type: 'interrupt',
+      }
+    },
+  })
+
+  const approve: HITLController['approve'] = (id, scope = options.scope ?? 'call') => {
+    const request = pending.get(id)
+    if (request == null)
+      return false
+
+    if (scope === 'call') {
+      callAllows.add(request.key)
+    }
+    else if (scope === 'run') {
+      runAllows.add(request.key)
+    }
+    else {
+      const state = pendingState.get(id)
+      if (state == null)
+        return false
+
+      setConversationAllow(state, request.key)
+    }
+
+    resumeDecisions.set(id, { id, scope, type: 'approved' })
+    pending.delete(id)
+    pendingState.delete(id)
+    return true
+  }
+
+  const reject: HITLController['reject'] = (id, message = 'TOOL_HITL_REJECTED') => {
+    const request = pending.get(id)
+    if (request == null)
+      return false
+
+    rejectedKeys.set(request.key, message)
+    resumeDecisions.set(id, {
+      id,
+      key: request.key,
+      message,
+      type: 'rejected',
+    })
+    pendingState.delete(id)
+    return pending.delete(id)
+  }
+
+  return {
+    approve,
+    clear: () => {
+      pending.clear()
+      pendingState.clear()
+      callAllows.clear()
+      rejectedKeys.clear()
+      resumeDecisions.clear()
+      runAllows.clear()
+    },
+    getDecisionForResume: id => resumeDecisions.get(id),
+    pending: () => [...pending.values()],
+    plugin: {
+      name,
+      onTurnDone: () => {
+        runAllows.clear()
+      },
+      resolveToolCallControl: ({ privateState }) => {
+        if (privateState == null)
+          throw new Error('@apeira/plugin-hitl requires plugin private state support from @apeira/core.')
+
+        return createControl(privateState)
+      },
+      version,
+    },
+    reject,
+    setMode: nextMode => mode = nextMode,
+  }
+}


Critical Persistence Bug: Stale Closure on Private State

When approve(id, 'conversation') is called, it retrieves the PluginPrivateStateApi from the pendingState map and updates it. However, because approve is called asynchronously after the turn has already completed, the turn's state has already been serialized and committed to storage. Any updates to this stale privateState closure reference are lost and never persisted to the session database or localStorage.

Solution

Instead of updating the private state immediately during the out-of-turn approve call, we can store the pending conversation approvals in an in-memory pendingConversationAllows set. When the next turn starts (e.g., the resume turn), resolveToolCallControl is called with a fresh, active privateState reference. We can then commit all pending conversation approvals to this active state, ensuring they are correctly serialized and persisted when the turn finishes.

export const hitl = (options: HITLPluginOptions = {}): HITLController => { let mode: HITLMode = options.mode ?? 'ask' const callAllows = new Set<string>() const pending = new Map<string, HITLPendingRequest>() const pendingConversationAllows = new Set<string>() const resumeDecisions = new Map<string, HITLResumeDecision>() const rejectedKeys = new Map<string, string>() const runAllows = new Set<string>() const resolveDecision = async (context: HITLPolicyContext): Promise<HITLPolicyDecision> => { if (mode === 'off' || mode === 'allow') return { type: 'allow' } if (mode === 'deny') return { type: 'deny' } return await options.policy?.(context) ?? { type: 'ask' } } const createControl = (privateState: PluginPrivateStateApi<HITLPrivateState>): ToolCallControl => ({ preToolCall: async (context) => { const key = getToolKey(context.toolCall) if (mode === 'deny') { return { message: 'TOOL_HITL_REJECTED', type: 'reject', } } const rejectedMessage = rejectedKeys.get(key) if (rejectedMessage != null) { rejectedKeys.delete(key) return { message: rejectedMessage, type: 'reject', } } if (callAllows.has(key)) { callAllows.delete(key) return { type: 'continue' } } if (runAllows.has(key) || hasConversationAllow(privateState, key)) return { type: 'continue' } const decision = await resolveDecision({ ...context, key, metadata: metadata.get(context.tool), }) if (decision.type === 'allow') return { type: 'continue' } if (decision.type === 'deny') { return { message: decision.message ?? 'TOOL_HITL_REJECTED', type: 'reject', } } const interruption: ToolInterruption = { data: { key, metadata: metadata.get(context.tool), }, id: `hitl_${context.toolCall.id}`, reason: decision.message ?? 'Human review required.', toolCall: context.toolCall, } pending.set(interruption.id, { id: interruption.id, key, metadata: metadata.get(context.tool), reason: interruption.reason, toolCall: context.toolCall, toolName: context.tool.function.name, }) return { interruption, type: 'interrupt', } }, }) const approve: HITLController['approve'] = (id, scope = options.scope ?? 'call') => { const request = pending.get(id) if (request == null) return false if (scope === 'call') { callAllows.add(request.key) } else if (scope === 'run') { runAllows.add(request.key) } else { pendingConversationAllows.add(request.key) } resumeDecisions.set(id, { id, scope, type: 'approved' }) pending.delete(id) return true } const reject: HITLController['reject'] = (id, message = 'TOOL_HITL_REJECTED') => { const request = pending.get(id) if (request == null) return false rejectedKeys.set(request.key, message) resumeDecisions.set(id, { id, key: request.key, message, type: 'rejected', }) return pending.delete(id) } return { approve, clear: () => { pending.clear() pendingConversationAllows.clear() callAllows.clear() rejectedKeys.clear() resumeDecisions.clear() runAllows.clear() }, getDecisionForResume: id => resumeDecisions.get(id), pending: () => [...pending.values()], plugin: { name, onTurnDone: () => { runAllows.clear() }, resolveToolCallControl: ({ privateState }) => { if (privateState == null) throw new Error('@apeira/plugin-hitl requires plugin private state support from @apeira/core.') if (pendingConversationAllows.size > 0) { for (const key of pendingConversationAllows) { setConversationAllow(privateState, key) } pendingConversationAllows.clear() } return createControl(privateState) }, version, }, reject, setMode: nextMode => mode = nextMode, } }

gemini-code-assist · 2026-05-27T07:24:13Z

+const parseHitlReview = (result: string | undefined): HitlReview | undefined => {
+  if (result == null || !result.includes('HITL_REVIEW_REQUIRED'))
+    return undefined
+
+  const lines = Object.fromEntries(result.split('\n').flatMap((line) => {
+    const [key, ...rest] = line.split('=')
+    return key == null || rest.length === 0 ? [] : [[key, rest.join('=')]]
+  }))
+
+  if (lines.id == null)
+    return undefined
+
+  return {
+    args: lines.args ?? '{}',
+    id: lines.id,
+    reason: lines.reason ?? 'Human review required.',
+    tool: lines.tool ?? 'tool',
+  }
+}


Multi-line JSON Parsing Bug

Splitting the raw result string by \n and then by = is extremely fragile. If the args parameter contains a formatted JSON string with newlines (which is common for complex tool arguments), splitting by \n will break the JSON structure into incomplete lines, causing JSON.parse to throw an error and rendering the review card broken.

Solution

Use a regular expression with the multiline (m) and dotall (s) flags to robustly extract the values of id, tool, args, and reason even if they span multiple lines.

Suggested change

const parseHitlReview = (result: string | undefined): HitlReview | undefined => {

if (result == null || !result.includes('HITL_REVIEW_REQUIRED'))

return undefined

const lines = Object.fromEntries(result.split('\n').flatMap((line) => {

const [key, ...rest] = line.split('=')

return key == null || rest.length === 0 ? [] : [[key, rest.join('=')]]

}))

if (lines.id == null)

return undefined

return {

args: lines.args ?? '{}',

id: lines.id,

reason: lines.reason ?? 'Human review required.',

tool: lines.tool ?? 'tool',

}

}

const parseHitlReview = (result: string | undefined): HitlReview | undefined => {

if (result == null || !result.includes('HITL_REVIEW_REQUIRED'))

return undefined

const idMatch = result.match(/^id=(.*)$/m)

const toolMatch = result.match(/^tool=(.*)$/m)

const argsMatch = result.match(/^args=([\s\S]*?)(?=\nreason=|\n$|$)/m)

const reasonMatch = result.match(/^reason=([\s\S]*)$/m)

if (idMatch == null)

return undefined

return {

args: argsMatch ? argsMatch[1].trim() : '{}',

id: idMatch[1].trim(),

reason: reasonMatch ? reasonMatch[1].trim() : 'Human review required.',

tool: toolMatch ? toolMatch[1].trim() : 'tool',

}

}

gemini-code-assist · 2026-05-27T07:24:13Z

+const getToolKey = (toolCall: ToolCall) =>
+  JSON.stringify([
+    toolCall.function.name,
+    toolCall.function.arguments ?? '',
+  ])


Improvement: Canonicalize JSON Arguments

LLMs can occasionally generate identical tool arguments with different key ordering or whitespace formatting. Since getToolKey currently stringifies the raw arguments string directly, semantically identical calls could produce different keys, bypassing existing conversation-scope approvals.

Solution

Parse the arguments JSON and recursively sort the keys before stringifying to ensure a canonical representation.

const getToolKey = (toolCall: ToolCall) => { let args = toolCall.function.arguments ?? '' try { const parsed = JSON.parse(args) const canonicalize = (obj: any): any => { if (Array.isArray(obj)) return obj.map(canonicalize) if (obj !== null && typeof obj === 'object') { return Object.keys(obj).sort().reduce((acc: any, key) => { acc[key] = canonicalize(obj[key]) return acc }, {}) } return obj } args = JSON.stringify(canonicalize(parsed)) } catch {} return JSON.stringify([ toolCall.function.name, args, ]) }

- Introduced @apeira/plugin-hitl for managing tool interruptions and approvals. - Updated core types to support tool interruption events and private state management. - Enhanced agent runtime to handle plugin-specific private states during turn processing. - Modified session store to maintain plugin private state across context updates. - Implemented tests for HITL plugin functionality, including approval and rejection scenarios. - Updated package dependencies and workspace configuration to include new HITL plugin.

- Introduced HTML and Markdown versions of the Plugin Common Tools roadmap. - Documented phases of development, including goals, status, scope, tasks, and acceptance criteria. - Established a status legend for tracking progress on each phase. - Highlighted the need for discussions on approval policies affecting multiple phases.

exoticknight · 2026-05-27T12:43:07Z

/gemini

…L plugin

exoticknight requested a review from kwaa May 27, 2026 07:22

exoticknight self-assigned this May 27, 2026

gemini-code-assist Bot reviewed May 27, 2026

View reviewed changes

exoticknight added 2 commits May 27, 2026 20:40

exoticknight force-pushed the codex/plugin-hitl branch from a26560e to 7f49375 Compare May 27, 2026 12:41

exoticknight changed the title ~~feat(hitl): add human-in-the-loop tool control plugin~~ feat(plugin-hitl): add human-in-the-loop tool approval flow May 27, 2026

refactor: update tool call control to tool call hooks in core and HIT…

0b290a0

…L plugin

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(plugin-hitl): add human-in-the-loop tool approval flow#23

feat(plugin-hitl): add human-in-the-loop tool approval flow#23
exoticknight wants to merge 3 commits into
mainfrom
codex/plugin-hitl

exoticknight commented May 27, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 27, 2026

Uh oh!

gemini-code-assist Bot May 27, 2026

Uh oh!

gemini-code-assist Bot May 27, 2026

Uh oh!

exoticknight commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

exoticknight commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

feat

refactor

chore

review aids

Core Hook Flow

Approval Scope Model

Package Wiring

Example Runtime Flows

Suggested Review Order

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 27, 2026

Choose a reason for hiding this comment

Critical Persistence Bug: Stale Closure on Private State

Solution

Uh oh!

gemini-code-assist Bot May 27, 2026

Choose a reason for hiding this comment

Multi-line JSON Parsing Bug

Solution

Uh oh!

gemini-code-assist Bot May 27, 2026

Choose a reason for hiding this comment

Improvement: Canonicalize JSON Arguments

Solution

Uh oh!

exoticknight commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

exoticknight commented May 27, 2026 •

edited

Loading